NVIDIA’s CUTLASS 3.x Enhances GEMM Kernel Design with Modular Abstractions

Published:
2025-07-17 15:29:02

NVIDIA has unveiled CUTLASS 3.x, a significant update to its CUDA Templates for Linear Algebra Subroutines library. The new version introduces a modular, hierarchical system for designing General Matrix Multiply (GEMM) kernels, aimed at improving flexibility and performance across NVIDIA's GPU architectures.

The redesign emphasizes composable building blocks selected through template parameters, so developers can combine high-level abstractions with low-level customization. A kernel can then be tuned to different hardware while the code stays readable, a critical factor for adoption in performance-sensitive environments.
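To make the idea of composable, template-selected building blocks concrete, here is a minimal CPU-only sketch in plain C++. It does not use the CUTLASS API: `TileShape`, `NaiveMainloop`, `LinearCombination`, `GemmKernel`, and `example_gemm` are hypothetical names chosen for illustration, and the real library expresses these layers as CUDA templates with far more tuning parameters. The sketch only shows how a tile shape, a mainloop, and an epilogue might be supplied as template parameters and swapped independently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compile-time tile shape: rows, columns, reduction depth.
template <int M, int N, int K>
struct TileShape {
    static constexpr int kM = M;
    static constexpr int kN = N;
    static constexpr int kK = K;
};

// Hypothetical mainloop block: accumulates one kM x kN output tile,
// stepping through the K dimension in chunks of kK (row-major data).
template <class Tile>
struct NaiveMainloop {
    static void run(const std::vector<float>& A, const std::vector<float>& B,
                    int m0, int n0, int M, int N, int K,
                    std::vector<float>& acc) {
        for (int k0 = 0; k0 < K; k0 += Tile::kK)
            for (int i = 0; i < Tile::kM && m0 + i < M; ++i)
                for (int j = 0; j < Tile::kN && n0 + j < N; ++j)
                    for (int k = k0; k < k0 + Tile::kK && k < K; ++k)
                        acc[i * Tile::kN + j] +=
                            A[(m0 + i) * K + k] * B[k * N + (n0 + j)];
    }
};

// Hypothetical epilogue block: D = alpha * accumulator + beta * C.
struct LinearCombination {
    float alpha;
    float beta;
    float operator()(float acc, float c) const { return alpha * acc + beta * c; }
};

// Top layer: composes a tile shape, a mainloop, and an epilogue.
// Any of the three can be replaced without touching the others.
template <class Tile, class Mainloop, class Epilogue>
struct GemmKernel {
    Epilogue epilogue;

    void operator()(const std::vector<float>& A, const std::vector<float>& B,
                    const std::vector<float>& C, std::vector<float>& D,
                    int M, int N, int K) const {
        assert(A.size() == static_cast<std::size_t>(M) * K);
        assert(B.size() == static_cast<std::size_t>(K) * N);
        assert(C.size() == static_cast<std::size_t>(M) * N);
        assert(D.size() == C.size());
        for (int m0 = 0; m0 < M; m0 += Tile::kM) {
            for (int n0 = 0; n0 < N; n0 += Tile::kN) {
                std::vector<float> acc(Tile::kM * Tile::kN, 0.0f);
                Mainloop::run(A, B, m0, n0, M, N, K, acc);
                // Apply the epilogue to the valid part of the tile.
                for (int i = 0; i < Tile::kM && m0 + i < M; ++i)
                    for (int j = 0; j < Tile::kN && n0 + j < N; ++j) {
                        std::size_t idx =
                            static_cast<std::size_t>(m0 + i) * N + (n0 + j);
                        D[idx] = epilogue(acc[i * Tile::kN + j], C[idx]);
                    }
            }
        }
    }
};

// Small usage example: a 3x4 by 4x2 product with 2x2x2 tiles,
// so the last row tile is only partially filled.
inline std::vector<float> example_gemm() {
    const int M = 3, N = 2, K = 4;
    std::vector<float> A(M * K), B(K * N, 2.0f), C(M * N, 1.0f), D(M * N, 0.0f);
    for (int i = 0; i < M; ++i)
        for (int k = 0; k < K; ++k)
            A[i * K + k] = static_cast<float>(i + 1);

    using Tile = TileShape<2, 2, 2>;
    const GemmKernel<Tile, NaiveMainloop<Tile>, LinearCombination> gemm{{1.0f, 0.5f}};
    gemm(A, B, C, D, M, N, K);
    return D;
}
```

Changing `TileShape<2, 2, 2>` or substituting a different epilogue type alters the kernel's behavior without any edit to `GemmKernel` itself, which is the kind of separation the article attributes to the CUTLASS 3.x redesign.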

Notably, CUTLASS 3.x extends support to NVIDIA's latest Hopper and Blackwell architectures, ensuring compatibility with cutting-edge GPU designs. The update reflects NVIDIA's continued investment in developer tools that bridge the gap between theoretical performance and practical implementation.
